Classification of Filipino Speech Rhythm Using Computational and Perceptual Approach
Abstract
This study combines computational and perceptual methods to classify Filipino speech rhythm. Speech rhythm may be described as a language's distinguishing durational sound pattern, resulting from the complexity of the language's syllable inventory. Computational methods correlate rhythm types with acoustic features such as vocalic and consonantal intervals; one such method is Multivariate Discriminant Analysis (MDA). Perceptual methods contrast the rhythm of an unclassified language with prototype syllable-timed and stress-timed sentences. To isolate rhythm from speech, a data-stripping technique called flat sasasa resynthesis was implemented, in which consonants are replaced with /s/ and vowels with /a/, producing resynthesized alternating "sasasa" sounds at a constant pitch (F0). Rhythm discrimination and classification were examined for consistency between the data-modeling and listening-test results. In the computational experiment, an MDA classifier trained to distinguish English and Japanese sentences labeled Filipino sentences as Japanese 67% of the time. In the perceptual experiment, listeners perceived Filipino to be more similar to Japanese. Together, these results provide computational and perceptual validation that Filipino is syllable-timed, just like Japanese.
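As a rough illustration of the segment-level idea behind the abstract, the sketch below relabels a segmented utterance in the sasasa style (consonants to /s/, vowels to /a/, durations kept) and then computes interval measures from the result. It is a minimal sketch under stated assumptions, not the study's implementation: the vowel set, the function names, and the example durations are hypothetical, and the measures shown (%V, ΔV, ΔC) are ones commonly used in the rhythm-class literature; the abstract does not say which features fed the MDA classifier. The code only manipulates labels and durations and does not synthesize audio.

```python
from statistics import pstdev

# Assumption: a simplified five-vowel label set, for illustration only.
VOWELS = set("aeiou")


def to_sasasa(segments):
    """Relabel (label, duration) segments: vowels become 'a', all else 's'.

    Durations are kept unchanged, so only the timing pattern survives.
    """
    return [("a" if label in VOWELS else "s", dur) for label, dur in segments]


def merge_intervals(sasasa):
    """Merge adjacent same-type segments into vocalic/consonantal intervals."""
    intervals = []
    for kind, dur in sasasa:
        if intervals and intervals[-1][0] == kind:
            # Extend the running interval (e.g. a consonant cluster).
            intervals[-1] = (kind, intervals[-1][1] + dur)
        else:
            intervals.append((kind, dur))
    return intervals


def rhythm_features(intervals):
    """Compute %V, delta-V and delta-C from merged intervals.

    Assumption: population standard deviation is used for the deltas.
    """
    voc = [d for k, d in intervals if k == "a"]
    con = [d for k, d in intervals if k == "s"]
    total = sum(voc) + sum(con)
    return {
        "percent_v": 100.0 * sum(voc) / total,
        "delta_v": pstdev(voc),
        "delta_c": pstdev(con),
    }


# Hypothetical segmentation with made-up durations in milliseconds.
example = [("s", 100), ("a", 80), ("l", 60), ("a", 80),
           ("m", 60), ("a", 80), ("t", 100)]
features = rhythm_features(merge_intervals(to_sasasa(example)))
print(features)
```

Feature vectors of this kind, computed per sentence, are the sort of input a discriminant classifier could be trained on; the classifier itself is left out here to keep the sketch free of external dependencies.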
Similar resources
Explicit duration modelling in HMM-based speech synthesis using a hybrid hidden Markov model-multilayer perceptron
In HMM-based speech synthesis, it is important to correctly model duration because it has a significant effect on the perceptual quality of speech, such as rhythm. For this reason, hidden semi-Markov model (HSMM) is commonly used to explicitly model duration instead of using the implicit state duration model of HMM through its transition probabilities. The cost of using HSMM to improve duration...
Classification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...
Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One of the interesting topics in the multimedia domain concerns enabling computers to produce speech. Speech synthesis grants the computer the human ability to produce speech. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...
An empirical study of the perception of language rhythm
Linguists have traditionally classified languages into three rhythm classes, namely stress-timed, syllable-timed and mora-timed languages. However, this classification has remained controversial for various reasons: the search for reliable acoustic cues to the different rhythm types has long remained elusive; some languages are claimed to belong to none of the three classes; and no perceptual s...
Testing the Perception of Speech Rhythm on
It has long been assumed that the stress-timed vs. syllable-timed dichotomy is based on perceptual impressions of speech rhythm. However, experimental tests are rare in the literature and not all of them have successfully found perceptual evidence for speech rhythm categorization. Experimental protocols have been very different, sometimes testing naïve vs. non-naïve listeners, sometimes using na...
Publication date: 2011